Safe Policy Search with Gaussian Process Models
Authors
Abstract
We propose a method to optimise the parameters of a policy that will be used to safely perform a given task in a data-efficient manner. We train a Gaussian process model to capture the system dynamics, based on the PILCO framework. Our model has useful analytic properties, which allow closed-form computation of error gradients and estimation of the probability of violating given state-space constraints. During training, as well as operation, only policies that are deemed safe are implemented on the real system, minimising the risk of failure.
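As a rough illustration of what estimating the probability of violating state space constraints can look like, the sketch below assumes the predicted state at each time step is summarised by a one-dimensional Gaussian (a mean and a standard deviation) and that the constraint is a simple interval. The function names, the 0.95 threshold, and the treatment of time steps as independent are assumptions made for this example and are not taken from the paper.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a univariate Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_within_bounds(mean, std, lower, upper):
    """Probability that a Gaussian state lies inside [lower, upper]."""
    return normal_cdf(upper, mean, std) - normal_cdf(lower, mean, std)


def trajectory_safety(means, stds, lower, upper):
    """Product of per-step probabilities of staying inside the bounds.

    Assumption: time steps are treated as independent, which is a
    simplification made for this sketch.
    """
    p = 1.0
    for m, s in zip(means, stds):
        p *= prob_within_bounds(m, s, lower, upper)
    return p


def is_safe(means, stds, lower, upper, threshold=0.95):
    """Hypothetical safety check: accept a policy only if the predicted
    trajectory stays within bounds with probability >= threshold."""
    return trajectory_safety(means, stds, lower, upper) >= threshold
```

Under these assumptions, a policy whose predicted trajectory has narrow distributions well inside the bounds passes the check, while one whose predicted mean drifts towards a bound, or whose predictive uncertainty is large, is rejected before it is run on the real system.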
Similar resources
Probabilistic Differential Dynamic Programming
We present a data-driven, probabilistic trajectory optimization framework for systems with unknown dynamics, called Probabilistic Differential Dynamic Programming (PDDP). PDDP takes into account uncertainty explicitly for dynamics models using Gaussian processes (GPs). Based on the second-order local approximation of the value function, PDDP performs Dynamic Programming around a nominal traject...
Safe Exploration for Active Learning with Gaussian Processes
In this paper, the problem of safe exploration in the active learning context is considered. Safe exploration is especially important for data sampling from technical and industrial systems, e.g. combustion engines and gas turbines, where critical and unsafe measurements need to be avoided. The objective is to learn data-based regression models from such technical systems using a limited budget...
Variational Bayesian Optimization for Runtime Risk-Sensitive Control
We present a new Bayesian policy search algorithm suitable for problems with policy-dependent cost variance, a property present in many robot control tasks. We extend recent work on variational heteroscedastic Gaussian processes to the optimization case to achieve efficient minimization of very noisy cost signals. In contrast to most policy search algorithms, our method explicitly models the co...
Governance: Blending Bureaucratic Rules with Day to Day Operational Realities; Comment on “Governance, Government, and the Search for New Provider Models”
Richard Saltman and Antonio Duran take up the challenging issue of governance in their article “Governance, Government and the Search for New Provider Models,” and use two case studies of health policy changes in Sweden and Spain to shed light on the subject. In this commentary, I seek to link their conceptualization of governance, especially its interrelated roles at the macro, meso, and micro...
Grid-search event location with non-Gaussian error models
This study employs an event location algorithm based on grid search to investigate the possibility of impr...
Journal title:
- CoRR
Volume abs/1712.05556 Issue
Pages -
Publication date 2017